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RUI DILAO AND DANIELE MURARO 



Abstract. We present a general methodology in order to build mathematical 
models of genetic regulatory networks. This approach is based on the mass ac- 
tion law and on the Jacob and Monod operon model. The mathematical models 
are built symbolically by the Mathematica software package GeneticNetworks. 
This package accepts as input the interaction graphs of the transcriptional acti- 
vators and repressors and, as output, gives the mathematical model in the form 
of a system of ordinary differential equations. All the relevant biological param- 
eters are chosen automatically by the software. Within this framework, we show 
that threshold effects in biology emerge from the catalytic properties of genes 
and its associated conservation laws. We apply this methodology to the segment 
patterning in Drosophila early development and we calibrate and validate the 
genetic transcriptional network responsible for the patterning of the gap pro- 
teins Hunchback and Knirps, along the antero-posterior axis of the Drosophila 
embryo. This shows that patterning at the gap genes stage is a consequence of 
the relations between the transcriptional regulators and their initial conditions 
along the embryo. 



I. Introduction 

A genetic regulatory network is an ensemble of interactions in a biological process 
involving proteins, genes and mRNAs. The interactions between different proteins 
and genes can be done by transcriptional activation and repression at the level of 
the genes, by protein-protein interactions, and by protein-mRNA interactions. 

A genetic regulatory networks is described by a graph where vertices repre- 
sent genes, proteins, enzymes or other chemical substances. The edges represent 
transformations, e. g., phosphorylation and dephosphorylation, or activation and 
inhibitory actions through transcription regulators. 
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More precisely, a genetic regulatory networks is described by a double graph 
G = (V, E a , E r ), where V is the set of vertices or nodes of the graph and E a and 
E r are two sets of ordered pairs of vertices of the double graph. Each ordered pair of 
vertices defines the activation or the repression mechanism of the first node over the 
second. In classical graph theory G = (V, E a ) and G = (V, E r ) are two graphs with 
a common set of vertices. For example, in Figure [TJ we show the graph of a genetic 
network associated with the production of the proteins Bicoid (BCD), Hunchback 



(HB), Knirps (KNI) and Tailless (TLL) in Drosophila early development, (Alves 



|and Dilao[|2006[ ). In this example, we have V = {bed, Kb, BCD, HB, KNI, TLL}, 
E a = {(bcd,BCD),(hb,HB),(BCD,HB),(BCD,KNI)} and 
E r = {(HB, KNI), (KNI, HB), (TLL, KNI)}. 




TLL 



Figure 1. Graph describing the genetic network associated with 
the production of proteins Bicoid (BCD), Hunchback (HB), Knirps 
(KNI) and Tailless (TLL) in Drosophila early development. mRNAs 
hunchback and bicoid are represented by hb and bed respectively. 
Arrows represent activations and are listed in the set of ordered 
pairs E a . Lines with perpendicular endings represent repressions 
and are listed in the set E r . 



The graph of Figure [T] has a clear biological meaning. It expresses the fact that 
BCD is a transcriptional activator of both HB and KNI, HB and KNI proteins 
both repress each other, and TLL is a repressor of KNI. The vertices of the graph 
of Figure [T] can represent mRNAs, as in the case of hb and bed, or proteins, as in 



4 RUI DILAO AND DANIELE MURARO 

the case of BCD and TLL, or genes and proteins simultaneously, as in the case of 
HB and KNI. 

Here we propose a set of rules in order to construct the model equations as- 
sociated with a genetic regulatory networks described by a double graph G = 
(V, E a , Eb). This paper is an attempt to delineate a methodological approach for 
the construction of mathematical models of genetic regulation from the principles 
of chemical kinetics and chemical bound. In the literature, it is often found ex- 
amples of mathematical models of biological systems described by different sets 
of equations and characterized by different sets of parameters that are difficult to 
interpret and measure. Making qualitative predictions with these different models 
has a limited predictable value. For a review on the different approaches see, for 



Klipp et al. 


2005 


) and 


Jong 


(2002 



From the point of view of predictability, it is important that mathematical model 
could be derived from first principles, leading to a meaningful identification of the 
biological parameters, their calibration and validation with experimental data. 

In the construction of models for generic regulatory networks, we assume that 
models can be built with rate equations reflecting a mean field view of the stochas- 
tic random motion occurring at the molecular scale. This mean field approach, 
also called mass action law, is derived from the probabilistic collision laws occur- 
ring at the molecular scale. The models originated from this view are described by 



ordinary differential equations with polynomial vector fields, (van Kampen, 1992). 

One of the advantages of the mass action law approach is that the mean field 
rate equations have a direct microscopic interpretation, being associated with the 
collision mechanism that are in the origin of every reactive process. For model re- 
finement, fluctuations can also be studied through the corresponding master equa- 
tion. From the experimental point of view, microbiology techniques are strongly 
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anchored in the mass action law or mean field approach. For example, genome 
sequencing is only possible in the framework of the mean field approach. 

For genetic regulatory networks described by graphs with a large number of 
vertices, and a complex structure of edges, the rate equations describing the evo- 
lution in time of concentrations are in general difficult to build, and are critically 
dependent on the assumptions done about the biological and chemical interactions 
involved. During the development of these complex models, it is often necessary 
to test different configurations changing the parameters and the initial conditions. 
Writing by hand all of these information is both time-consuming and error-prone. 

In order to perform these tasks automatically, we have developed two Mathemat- 
ica software packages, Kinetics and GeneticNetworks, that execute the symbolic 
computations associated with the construction of the model equations for a ge- 
netic regulatory network. The result of the analysis is in symbolic form, and can 
be used in Mathematica or in C for further processing and numerical integration. 
The software packages GeneticNetworks and Kinetics are freely distributed from 
the site "https://sd.ist.utl.pt". 

The Kinetics package implements the mass action law in its polynomial exact 
form, computing symbolically the associated rate equations and conservation laws. 
The package GeneticNetworks implements a particular model for protein-gene reg- 
ulation. This model for the protein-gene regulation is based on the operon model 



of Jacob and Monod (Jacob and Monod, 1961), and its basic properties have been 



previously introduced in Alves and Dilao ( 2005 ) . 

This paper is organized as follows. In section [2] we briefly review the mass 
action law of chemical kinetics and we introduce the collision graphs associated 
with the mass action law. We derive the basic mass action rate equations. A 
special emphasis is done on mass action conservation laws, an important feature 
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that is in the very foundations of threshold effects in biology. In other approaches, 
threshold do not result as emergent phenomena, but must be imposed through 



"ad-hoc" regulatory functions (see for example (Klipp et al. 2005, pp. 237) or 



(Jong, 2002)). 



In section |3j we describe the mechanism of genetic regulation based on the 



Jacob and Monod operon model, (Jacob and Monod, 1961), and we introduce the 



modelling assumptions for the construction of the mathematical models of genetic 
networks described by double graphs. 

In section [4j we give an overview of the GeneticNetworks software package. In 
section [5j we show three different applications of genetic regulatory networks. In 
the first application, we show, with a very simple example of autoregulation, that 
the conservation law constant is a bifurcation parameter for the regulation model, 
inducing a threshold effect in the production of proteins. In the second example, 
we give a genetic regulatory network inducing a localized spiky pattern. In the 
third example, we analyze the experimental data associated with the KNI and 
HB inhibitory cross regulation in Drosophila early development, described by the 
double graph of Figure [TJ and we calibrate this model with the experimental data. 
In the final section, we summarize the main biological conclusions of the paper. 

2. The mass action law framework of chemical kinetics 

In general, an ensemble of chemical reactions is represented by the following 
collision diagram, 

(1) vnAi H h v im A m iHxM H h i~ti m A m 



where i — 1, . . . , n. The Aj, for j — 1, . . . , m, represent chemical substances, as for 
example, Aj = H2O. The constants and the stoichiometric coefficients, 
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in general, non-negative integers, and the constants are the rate constants. If 
= fi^ > 0, the corresponding substance Aj is a catalyst and, if /z^- > > 0, 
Aj is an autocatalyst. In the diagram ([I]), there are m chemical substances and n 
rate constants or chemical reactions. 

Under the hypothesis of homogeneity of the solution where reactions occur, 
the mass action law asserts that the time evolution of the concentrations of the 
chemical substances is described by the system of ordinary differential equations, 

r\ A " 

(2) ^f = Y. r ^-v*M u i 1 ---AZ m 

1=1 

where j = 1, . . . , m, and we use the same symbol to represent both the chemical 
substance and its concentration. The rate equations ^ are derived under the 
following assumptions: (i) chemical reactions, when they occur, are due to elastic 
collisions between the reactants, (ii) homogeneity of the reacting substances in the 
solution, and (iii) thermal equilibrium of the solution. All the kinetics aspects 
related with the dependence of the velocity of the reactions on the temperature 



or pressure are contained in the rate constants rj. For details see van Kampen 



fll992p . 

At the atomic and molecular scale, chemical reactions between molecules can 
occur only if molecules collide or approach each other to small distances where 
bounding forces become meaningful. These chemical bounding forces are of elec- 
trical or quantum origin, and at distances larger than the mean free path they 
become less important when compared with the kinetics associated with the molec- 
ular motion. As chemical reactions only occur if the chemical substances involved 
collide, the vector fields associated with the right hand side of ^ are in general 
quadratic, representing binary collisions. Higher order polynomial vector fields are 
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possible but, at the microscopic level, they are associated to triple or higher order 
collisions, a situations that occurs with a very low probability. Therefore, we will 
restrict our examples to models with two-body collisions. 
The equations (|2| can also be written in the matrix form, 

dA 



(3) 



dt 



Tlu(A) 



where r is a n x m matrix, A T = (Ai, . . . , A m ), and, 

00(A) = \ 

\ r n (fi n j — v n j)A v {" v ■ ■ ■ A^ m J 

In general, n 7^ m, and the equations in system ^ are not all independent. Let 
us denote by r the rank of the matrix T. The dimension of the null space of T 
relates with its rank by, diu^iVw/ZfT)) + r = m (number of columns of V). Let 
vi, . . . , v m -r be a basis of the Null space of T, then, Tvk — 0, for k = 1, . . . m — r. 
So, by ([3]), we have, 



(4) 



Hence, associated to the differential equations (|2]), we have the conservation laws, 



(5) 



A.Vk = cons 



where, k = 1, . . . m — r. 

The Mathematica software package Kinetics calculates the rate equations (J5]) 
describing the time evolution of the concentrations of the substances involved in 
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the reactions described by the collision diagram ([I]). The package calculates also 
the corresponding conservations laws 

The input of the package is the ensemble of chemical reactions, and the output 
of the package is the set of differential equations derived by the mass action law. 
The output can be later analyzed and studied by the analytical and numerical 
tools provided by Mathematica. In order to avoid long development times, the 
names of the rate constants are chosen automatically by the program. 

The package Kinetics has the usual help commands, and we provide the Mathe- 
matica notebook Kinetics Test, nb with several self-explanatory examples and com- 
putations. 

For example, let us now describe a simple protein production model with Ki- 
netics. The molecular biology dogma asserts that genes are the templates for 
protein production, and the standard mechanism for protein production can be 
represented by the collision diagrams, 



Gene + Polymerase Gene + Polymerase + mRNA 



(6) 



mRNA mRNA + Protein 

mRNA 

Protein 



Using the symbols G, Pol, R and P to represent gene, polymerase, mRNA and 
protein concentrations, respectively, the collision equations are the input for 
Kinetics, with the syntax, 

input={G + Pol^G + Pol + mRNA, mRNA -> mRNA + P, mRNA -> 

W1,P -> W2} 



where Wl and W2 are waist products. 
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For the collision mechanism ([6]), the rate equations for the protein concentration, 
and calculated by the package Kinetics, are, 

G' = 
Pol' = 

(7) 

R> = r x G.Pol - r 3 R 
P> = r 2 R - r 4 P 

and the rate constants have been chosen automatically by the software package. 
In this model, genes and mRNAs are catalysts, and these equations have the exact 
solutions, 

G(t) = G(0) 
Pol(t) = Pol(0) 

R(t) = R(0)e- r ^ + ^' (l - e~^) 
P(t) = G(0)Pol(0)^ 4 

I ( p(Q\ , fi(0)r 2 r 4 -r 1 r 2 G(0)PoZ(0) \ r4 t 

\ ^ ' r 4 (r 3 -r 4 ) J 

i J R(0)r 2 r3-rir2G(0)Po;(0) -r 3 t 
T3(r4—r3) 

To simplify the model equations of protein production and maintaining the 
catalyst properties of genes, in the following, instead of the collision mechanism 
([6]), we use the simplified or reduced kinetic mechanism, 



(9) 



Gene — 1 -+ Gene + Protein 



Protein 



S2 
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To the reactions ^ correspond the rate equations, 

a = o 

(10) 

p> = 8x G - s 2 P 



The rate equations ( 10 ) have the solutions 



G(t) = G(0) 



P(*) = G(0)a + (p(0)-G(0)g)e-«« 



Comparing the protein solutions in (11) and (ph, we conclude that the steady 



state of protein of both models is unique and is proportional to the gene concen- 
tration. The proportionality constant is different for both models, depending on 
the rates of the reactions involved. The steady state G(0)^ of model (J9| has a 
direct biological meaning: si is the rate of protein production and s 2 is the rate of 
protein degradation, and G(0) is the initial gene concentration. 

In the following, and in order to simplify the description of the transcriptional 
regulation of proteins, we will adopt the mechanism (j9j) in order to describe protein 
production. 

In both rate equations (10) and ([7]), the concentration of gene is constant along 



time, and therefore the gene concentration is a conservation law. In the following, 
we will show that the linear conservation laws of the form (J5j, will have an impor- 
tant role in the determination of steady states and in bifurcations associated with 
threshold effects. 
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3. A MASS ACTION FRAMEWORK TO DESCRIBE GENETIC REGULATORY 
NETWORKS BASED ON THE PROTEIN-GENE INTERACTION 

In the previous section, we have described a basic model model for the produc- 
tion of proteins, in the framework of the mass action law. Based on this framework, 
we generalize this view in order to include the case of transcriptional regulation 
by proteins. 

In order to keep general the approach presented here and to maintain the biologi- 
cal reality of model parameters, we make the following basic modeling assumptions: 

1) In order to describe quantitatively the protein production (concentration) 
within the molecular biology dogma, we only consider genes and proteins. Inter- 
mediate substances in the regulatory mechanism like catalysts and mRNAs are 
not considered. A model of protein production has been presented and analyzed 
at the end of the previous section. 

2) The regulation of protein production by the template gene is based on the 



Jacob and Monod operon model (Jacob and Monod, 1961 ). Namely, every gene has 



associated a certain number of binding sites where transcription factors can bind 
- activators or repressors, Figure [2] The regulation of activations and repressions 
occurs only through the binding sites. For a given double graph of interactions, 
the number of binding sites of a gene is determined by the number of edges that 
end up in the corresponding graph node. 

3) Transcription factors are the proteins associated with the vertices that acti- 
vate or inhibit the production of other proteins. The vertex of a graph represent 
a transcription factor only if it is the initial point of a edge of activation or in- 
hibition. If a vertex has incoming and outcoming edges of any type, this vertex 
represents a protein and a gene with several binding states. 
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4) Each transcription factor has its own binding site in the gene strand, or 
each gene has only one binding site for all the regulators. Both cases are treated 
separately in the model. We assume that when at least one activator is bound to 
a gene, the transcription is activated with a particular production rate for each 
combinatorial possibility of all the remaining binding sites. 

For example, in the double graph of the biological mechanism of Figure [TJ we 
have the following chemical substances, 

bed, hb mRNAs 

BCD, HB, TLL proteins 

(12) HBq, HBf™ EB° Km , HBS operons 

KNI° , KNI^, KNI°^ , KNI^ operons 

KNI° TLi , KNIg£f L , KNI% iTLL , KNl* c B n LL operons 

The description of the time evolution of protein concentrations of the mechanisms 
of Figure [I] involves one rate equation for each substance in (12), except eventu- 



ally for bed and hb. As proteins are produced from a gene template, the symbol 
associated to each vertex of the graph represents a protein. The operon states are 
represented by the same symbol with superscripts and lowerscripts. The super- 
scripts positions indicate the binding or unbinding of transcriptional activators. 
The lowerscripts positions indicate the binding or unbinding of transcriptional 
repressors. 

The model associated with a given double graph contains the rate equations for 
the proteins and the operons in its different states. We also assume by default that 
proteins always degrade and genes are autocatalytic substances that never degrade. 
The first assumption implies that protein concentrations remain bounded in time, 
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regulator 
binding sites Gene 

nnm 

v v 



operon 



FIGURE 2. Jacob and Monod operon model for the regulation of 
protein production. The transcription is regulated by the activators 
and the repressors binding to the binding sites of the gene. Figure 



adapted from (Alves and Dilao, 2005). 



and the second assumption implies the existence of a conservation law for the 
concentration of the operon states. 



4. The GeneticNetworks software package 

GeneticNetworks is a software package that generates the rate equations for the 
concentrations of genes and proteins in a regulatory network. The starting point 
is the double graph of activations and repressions, G = (V, E a ,E r ). The inputs 
for GeneticNetworks are two strings, activation and repression, that describe the 
transcriptional activations and repressions of proteins on genes. In the graph, 
the same symbol is used to denote both a gene and the corresponding produced 



protein. As we have seen in (12), the set of variables for the regulation model is 
constructed with the vertex symbols and the associated operon states. 
For example, using as input for GeneticNetworks the interaction strings, 

activation = {A — > B\ 

(13) 

repression = {R — ► B} 



the double graph of the genetic network (13) is shown in Figure |3| In this case, 
the double graph G = (V, E a , E T ) is characterized by the sets, V = {A, B, R}, 
E a = {(A,B)} and E r = {(R,B)}. 
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V 



B 

Figure 3. Double graph associated with the input strings (13) for 
the GeneticNetworks software package. 



In the interaction mechanism (13), protein A activates gene B, and protein R 



represses gene B. Therefore, the variables of the mechanism (13) are 



(14) 



A, B, R proteins 
B°, B$,B% B^ operons 

The following functions are defined in the GeneticNetworks package: 



• NetworkGraph, ManipulateGraph 

• Reactions, ReactionsOneSite, ReactionGraph 

• SubstanceNames, SubstanceVariables, 
SubstancelnitialConditions 

• ParameterNames, Parameter Input 

• Equations 

• ConservationLaws 



With these functions, we calculate the model equations associated with the input 



strings (13), calculate automatically the number of variables of the model, define 
all the relevant parameters and calculate the rate equations. For example, to 



the genetic regulatory network (13), the GeneticNetworks package associate the 
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collision diagrams, 



A+B 0^ B A 

b-a 

b r 

R + B B R 

b— r 

b r 



R + B. 







(15) 



b- r 

ba 



A + B\ 



B 



B A ^ B A + B 



B 



B 



?B A R + B 



To these collision diagrams, we have the mass action law rate equations, 



(16) 



A' 


= —b a A.B R 


+ b^Bi 


-b a A.B° 


+ b_ a B A 


B' 


= (.Pr) b Br 


+ W) B 


B A — d B B 


B! 


= — b r R.B A 


+ b_ r B A 


- b r R.B° 


+ b_ r B° R 


(B° )' 


= -b a A.B° 


+ b- a B$ 


— b r RB^ - 


1- b- r B° R 


wr 


= b a A.B° - 


b- a B A - 


b r R.BQ -f 


b- r B R 


(B° R )' 


= -b a A.B° R 


+ b-aBi 


+ b r R.Bl 


°-r-DR 




= b a A.B R — 


b-a,B R + b r R.BQ - 


- b_ r B^ 



and the conservation law, 



(17) 



B° (t) + B A (t) + B R (t) + B A (t) 
= fi °(0) + B A (0) + B R (0) + B A (0) 



From the conservation law (17), we can eliminate one of the equations in (16). In 
this genetic network, we have assumed that the protein concentrations of A and 
R are constant along time. 



EMERGENT THRESHOLDS IN GENETIC NETWORKS 17 



The rate equations (16) define a mass action law based model for the genetic 
regulatory network of Figure [3] 

In the implementation of GeneticNetworks, we have two possible modeling choices. 
In one choice, each different regulator has its own binding site, and the model dia- 



grams (15) have been constructed with this assumption. For the second choice, we 



consider that there is only one binding site in the operon where all the regulators 



bind. In this case, the collision diagrams associated with the genetic network (13) 
and calculated in the GeneticNetworks package are, 







b a 




A H 




b-a 
b r 


B a 


R- 


-B 


b— r 


Br 


B A 


Vb 


B A 


+ B 


B - 


dg 







By the mass action law, the ReactionsOneSite command leads to the rate equa- 
tions, 



(19) 



A' 


= b_ a B A -b a A.B 


B' 


= p B B A - d B B 


R' 


= b- r Bf{ — b r BQ.R 


(Bo)' 


= -b a A.B + b_ a B 


(Ba)' 


= b a A.B -b_ a B A 


(Br)' 


= b r BQ.R — b- r Bji 



The equations (18) have the conservation law, 



(20) 



B (t) + B A (t) + B R (t) = B (Q) + B A (0) + B R (0) 
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The two models ( 15 ) and ( 18 ) for the genetic network ( 13 ) are different and these 
two choices are implemented in the GeneticNetworks package. For the dynamical 



analysis of a particular case of the distinction between the two models (15) and 



(18), see (Alves and Dilao 2005). Below, we will show with a specific example 
that these two different regulation choices lead to qualitatively and quantitatively 
similar results. 



5. Applications 

5.1. An emerging threshold in the dynamics of a self-activating protein. 

As an application of the rules describing a genetic regulatory network just intro- 
duced, we discuss now the basic role of the conservation laws in the occurrence of 
threshold effects in regulation mechanisms. We study the case of a self-activating 
protein, where the produced protein activates its own production, Figure [4} 




Figure 4. Regulation graph describing a self-activating protein. 

The simplest self-activating genetic network is described by the input tables, 

activation = {A — > A} 
repression = {} . 
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The reactions and the parameters involved in this activation can be obtained by the 
GeneticNetworks command Reactions, followed by the command ReactionGraph, 



(21) 



A + A <=> A A 

A AVA^ A A + A 

A — ^ 



where A and A A are operon states and A is the corresponding protein. 
With the command Equations, we get the rate equations, 



(22) 



A' = a a A.A° + a_ a A A + p A A A - d A A 
(A )' = -a a A.A° + a. a A A 
(A A )' = a a A.A°-a„ a A A 



Finally, with the command ConservationLaws, we find, 



(23) 



A°(t) + A A {t) = A°(0) + A A (0) = c 



where c is a constant. 



Introducing (23) into (22), the independent set of rate equations describing the 



process (21 ) is, 



(24) 



A' = a a A.A° + a. a (c-A )+p A (c-A°)-d A A 
(A )' = -a a A.A° + a„ a (c - A°) 



We analyze now the steady state and the phase space structure of the solutions 



of equations (24). Equations (24) have two steady states with coordinates, 



(4 ) ,A (1) ) = (c,0) 
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and, 

(A° {2) ,A {2) ) = 



d A a_ a cp A a a - d A a_ a 



p A a a d A a a 



As the coordinate of the two fixed points are dependent of c, by (23), the steady 



state coordinates are dependent of the initial concentrations of the operon. 



Let J(j), i = 1,2 be the Jacobian of equation (24) evaluated at the fixed points. 
As, 

det J(i) = — det J( 2 ) 

and, 

det J(i) < a_ a d A < ca a p A 

then, we have, 



(yi^s, is of saddle type 
(A® 2 yA(2)) is a stable node 

(A^, A(i)) is a stable node 
(A9 2 n,A( 2 )) is of saddle type 

As, for c < a_ a d A /a a p A , A( 2 ) is negative, the protein concentration at the steady 



(25) if a_ a d A < ca a p A 



(26) if a- a d A > ca a p A 



state of the rate equations (24) is zero (^4(i) = 0). For c > a_ a d A /a a p A , the protein 
concentration at equilibrium is A^) — cpAa ^~^ Aa ~" ■ Therefore, the conservation law 



(23) tune a bifurcation for c = a_ a d A /a a p A (transcritical bifurcation), implying the 
existence of a threshold effect tuned by the conservation law parameter. 

In Figure |5j we show the dependence of the protein steady state on the total 
concentration of the gene. 



5.2. Spatial distribution and steady states. We have focused on genetic reg- 
ulatory models without specifying a spatial localization. In a genetic network 
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Figure 5. Dependence of the protein steady state on the to- 
tal concentration of the gene. Below the bifurcation or threshold 
c = a-a^Al a aVAi the equilibrium value of protein concentration A eq 
is zero, while above it takes the value A eq = (pAca a — dA a ~a) I '{dA a a) ■ 
The parameters are: a a = 1.0, a_ a = 0.1, pa = 0.2 and = 0.1. 



describing some biological process, the initial concentration of proteins can sig- 
nificantly vary in space, across tissues. This is specifically true in developmental 
processes, where often proteins show a spatial distribution with very sharp slopes 
that play a basic role in the establishment of body plans of organisms. A well 
known case is the Drosophila segmentation, where variations on protein concentra- 



tions across the embryo induces protein patterning, (Driever and Niisslein-Volhard 



1988; Niisslein-Volhard] |1992 ). One of such genetic regulatory networks is the one 



represented in Figure [TJ ( |Alves and Dilao , 2006). 

To show that patterning can be explained by the non homogeneity of initial 
conditions of regulators across tissues, we analyze a genetic regulatory network 
for the production of a protein B, regulated by one activator A and two repressor 
proteins R and S, Figure [6j To simplify our analysis, we take the competitive 
case, where the activator and the repressors bind to the same binding site of the 
operon of protein B. 

To simplify further, we assume that the spatial distribution of the proteins A, 
R and S are constants. Under these conditions and with the regulation model 
developed in the package GeneticNetworks, we obtain the system of linear rate 
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Figure 6. Genetic regulatory network for the production of protein 
B with one activator A and two repressors R and S. 



equations, 

B' = p B B A -d B B 

, . (B A )' = b a AB -b_ a B A 

(27) 

(Br)' = b r RB — b- r B R 

(B s )' = b s SB -b_ s B s 

where B = B(x,t), B = B (x,t), B A = B A (x,t), B R = B R (x,t), B s = B s (x,t), 
A = A(x), R = R(x) and S = S(x). These concentration variables are considered 
to be defined in a spatial one-dimensional bounded region of the real line (sGlC 
M.). The following conservation law holds, 

B (x, t) + B A (x, t) + B R (x, t) + B s {x, t) = c(x) 

where c(x) is a constant, depending eventually of the spatial independent coordi- 



nate x. The system of rate equations (27) has one steady state with coordinates 



5 PbB a - cAb a b_ r b- 

t> = — ; , E>A 



where, 



d B ' A D 

- cRb„ a b r b_ s - cSb^ a b_ r b s 
B R = , B s = 



D = b- a b^ r b- s + Ab a b^ r b- S + Rb- a b r b_ s + Sb- a b^ r b s 
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Choosing b a = b r = b s , and 6_ a = 6_ r = &_ s , the steady state concentration of the 
protein B, is, 

(9R) R(r) = c(x)A(x)p B 

1 ; { ! {l + A(x) + R(x) + S(x))d B 

In Figure [7| we show the steady state concentration (28) of protein B, as a func- 
tion of a spatial coordinate, x G [0, 1]. We have considered the initial distributions 
A(x) = 0.8, R(x) = 83e~ 7x , S{x) = 83e~ 7 ^- x \ c(x) = 100, and the parameter 
value Pb/cIb = 1- For this case, the steady distribution of protein B is spiky, due 
to the inhibitory regulation of the repressors. We have analyzed the same genetic 
network of Figure [6] with a model with different binding sites for each regulator. 
The final result is similar with the one shown in Figure [7| This shows that the 
two model approaches in GeneticNetworks, with one binding site and with several 
binding sites in the operon, give similar qualitative results. When, the calibration 
and validation of models is not a problem, we can use the simplest one binding 
site regulation model in order to describe a given genetic regulatory network. The 
one-regulator site framework leads to simpler mathematical models. 



The solution ( 28 ) shows that steady states can depend on the initial conditions of 
the regulators (A(x), R(x) and S(x)) and on the initial conditions of the catalytic 
agents (c(x)), showing that spatial patterning can be a direct consequence of the 
non homogeneity of initial conditions. 

In this model, we have considered that the concentrations of A, R and S are 



constants, implying that the model equations (27) describe a system where A, R 
and S have a fast recovery time. This situation only occurs in thermodynamically 
open systems, as is the case of biological systems. 
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Figure 7. Steady states of proteins B, R and S for the genetic 
regulatory network of Figure [6j The steady state of protein B shows 
a spiky profile, resulting from the inhibitory action of proteins R 
and S. In this model, we have considered that the concentrations of 
R and S are constant in time and non-homogeneous in space. The 
activator protein A has been considered constant along the spatial 
region. 



5.3. Cross-regulation in Drosophila. In Drosophila early development, some 
proteins as Bicoid (BCD) and Hunchback (HB) are translated from mRNA of 
maternal origin. Early in the first developmental stages of Drosophila, at cleavage 
stage 13, BCD and HB proteins form a stable gradient along the antero-posterior 
axis of the Drosophila embryo, Figure |8l This stable gradient has it origin in a 



diffusion process occurring in the syncytial blastoderm of the embryo, (Dilao and 



Muraro, 2009). At a latter stage, in the 14th cleavage stage, other proteins as 



Knirps (KNI) show segments characterized by spiky concentration patterns along 
the antero-posterior axis of the embryo. 

We show now that the patterning HB and KNI proteins as observed at late 
cleavage stage 14 of the embryo of Drosophila is due to the non homogeneity of 
protein distribution at an early developmental stage. For that, we have taken the 
genetic regulatory network of Figure [TJ describing the genetic regulation of HB and 
KNI, and we have used the package GeneticNetworks to describe the production 
of proteins Hunchback and Knirps during the cleavage cycles 14 of the developing 
embryo of Drosophila. 
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Figure 8. Concentration of protein Hunchback (HB) at the end 
of cleavage cycle 13, and of Bicoid (BCD) and Tailless (TLL) pro- 
teins at the cleavage cycle 14, along the antero-posterior axis of the 
Drosophila embryo. The embryo length has been scaled from to 
100, and the vertical axis units are proportional to protein concen- 
tration. The continuous curves represent the mean distribution of 
the concentration of proteins calculated for the data of 954 embryos. 



The data has been taken from the FlyEx database (|Poustelnikova 
2004; 



et 



al. 



|Pisarev 



et 



al. 



ht t p : / / flyex . ams . suny sb . edu / flyex / ) . 



2009 



At the end of cleavage cycle 13 and beginning of the 14th, BCD, HB and TLL 
proteins have a non homogeneous distribution along the embryo, Figure [8j 

Hunchback and Knirps proteins are both activated by the maternally produced 
protein Bicoid, Figure [TJ and they mutually repress each other. The protein Knirps 
is also repressed by the protein Tailless of maternal origin. Therefore, the genetic 
network model obtained with the package GeneticNetworks should lead to the 
experimental profiles of HB and KNI, as observed at cleavage cycle 14, Figure |9j 

In Figure [9j we show the experimental profiles of HB and KNI at cleavage cycle 
14, as well as a fit of the experimental data obtained with a model built with the 
software package GeneticNetworks. The model equations have 22 free parameters. 
To fit the experimental data with the mathematical model, we have assumed that 
the initial protein concentrations of BCD and TLL are constant over time, and we 
have assumed the option of several binding sites per operon. To find the numerical 
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values of the model parameters in order to calibrate the model, we have used an 



optimization techniques based on a genetic algorithm, (Dilao et al. , 2009 ; Dilao and 



Muraro, 2009b). We have considered that, in the ordinary differential equations of 



the model, the time is also a free parameter. The protein profiles shown in Figure [9] 
are out of steady state pattern, obtained for the integration time t = 805 s. 




Figure 9. The dots with error bars are the mean experimental 
profiles of the proteins HB and KNI at cleavage cycle 14, for several 
Drosophila embryos along the antero-posterior axis. The embryo 
length has been scaled from to 100, and the vertical axis units are 
proportional to protein concentration. The lines show the prediction 
of the model profile for proteins HB and KNI, obtained with the 
software package GeneticNetworks. The mean distribution of the 
concentration of proteins has been calculated for the data of 954 
embryos, taken from the Fly Ex database (iPoustelnikova et al. 2004 



Pisarev et al, 2009, [http:/ /ffyex.ams.sunysb.e du/flyex/). 



The fitted curves in Figure [9] shows a very good agreement with the experimental 
data. HB fitts well in the embryo length range [5%, 80%], and KNI fittes well in 
the embryo length range [20%, 100%]. The quality of the fits has been measured 
by the penalized chi square test. For the two fits in Figure [9j we have obtained 
the reduced chi square values Xhb = 0.57 and Xkni = 0.68. This validates the 
genetic regulatory network of Figure [I] This suggest that the anterior and posterior 
regions of the embryo are under the control of additional regulators. 
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6. Conclusion 

We have presented a software tool to build mathematical models of genetic 
regulatory systems. The input of the package is the graph containing the list of 
transcriptional activators and repressors of the network. The software implements 
an approach based on the mass action law, on the operon model for regulation, and 
we have assumed in general that genes are always catalytic substances presented 
in any genetically controlled biological process. 

Within this approach, the usual threshold concept in biology emerges as a bi- 
furcation phenomenon of the model equations. These bifurcations are in general 
tuned by the conservation law constants of the model equations. Therefore, within 
this framework, thresholds are emergent phenomena and they result from the cat- 



alytic role of genes, (Tyson, 2007). 



Another consequence of the modeling approach presented here is that positional 
information in developmental processes can be described by non-homogeneity in 
the spatial distribution of regulators, and not necessarily associated with other 
physical processes of transport and diffusion. 

Finally, we have validated a genetic regulatory network for the production of 
Hunchback and KNI during the 14th cleavage stage of the embryo of Drosophila, 
and we have presented evidence that gap gene protein segments are out of equi- 
librium patterns. The genetic regulatory network of Figure [T] describes well the 
gap gene protein concentration of HB in the embryo length region [5%, 80%], as 
well as the gap gene protein concentration of KNI in the embryo lenght region 
[20%, 100%]. 
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